TB-WPRO: Title-Block Based Web Page Reorganization
Authors
Abstract
For cell phone users and blind people using non-visual browsers, browsing the Web with common browsers is quite inefficient because of information overload. This paper presents the TB-WPRO (Title-Block based Web Page Re-Organization) method, which hierarchically segments web pages into blocks using visual and layout information that reflects the web designers' intent. TB-WPRO segments web pages with the explicit goal of extracting self-described title blocks. To reorganize a web page, the segmentation result is transformed into a series of small web pages that can be accessed easily. Compared with current methods, the proposed approach obtains a promising segmentation result in which blocks are visually and semantically consistent with the original web pages. DOI: 10.4018/978-1-4666-2645-4.ch007
Similar resources
A Web Page Segmentation Method by using Headlines to Web Contents as Separators and its Evaluations
In this paper, we describe a Web page segmentation method based on title blocks and show its evaluation. Title blocks are minimum blocks that function as headlines for specific Web content. A typical Web page consists of multiple elements with different types of features, such as main content, navigation panels, copyright and privacy notices, and advertisements. Web page segmentation is the div...
Using Document Structure on Retrieving Webpages at the Web-CLEF 2006
We present a report on our participation in the mixed monolingual web task of the 2006 Cross-Language Evaluation Forum (CLEF). We compared the result of web page retrieval based on the page content, page title, and anchor page. The retrieval effectiveness for the combination of page content, page title, and anchor texts was better than that of the combination of page title and page title only. ...
Using Web Page Titles to Rediscover Lost Web Pages
Titles are denoted by the TITLE element within a web page. We queried the title against the Yahoo search engine to determine the page's status (found, not found). We conducted several tests based on elements of the title. These tests were used to discern whether we could predict a page's status based on the title. Our results increase our ability to determine bad titles but not our ability t...
Journal title:
- IJAPUC
Volume 3
Publication date: 2011